High-dimensional pharmacogenetic prediction of a continuous trait using machine learning techniques with application to warfarin dose prediction in African Americans

Authors

  • Erdal Cosgun
  • Nita A. Limdi
  • Christine W. Duarte
Abstract

MOTIVATION: With complex traits and diseases having potential genetic contributions of thousands of genetic factors, and with current genotyping arrays consisting of millions of single nucleotide polymorphisms (SNPs), powerful high-dimensional statistical techniques are needed to comprehensively model the genetic variance. Machine learning techniques have many advantages including lack of parametric assumptions, and high power and flexibility.

RESULTS: We have applied three machine learning approaches: Random Forest Regression (RFR), Boosted Regression Tree (BRT) and Support Vector Regression (SVR) to the prediction of warfarin maintenance dose in a cohort of African Americans. We have developed a multi-step approach that selects SNPs, builds prediction models with different subsets of selected SNPs along with known associated genetic and environmental variables and tests the discovered models in a cross-validation framework. Preliminary results indicate that our modeling approach gives much higher accuracy than previous models for warfarin dose prediction. A model size of 200 SNPs (in addition to the known genetic and environmental variables) gives the best accuracy. The R² between the predicted and actual square root of warfarin dose in this model was on average 66.4% for RFR, 57.8% for SVR and 56.9% for BRT. Thus RFR had the best accuracy, but all three techniques achieved better performance than the current published R² of 43% in a sample of mixed ethnicity, and 27% in an African American sample. In summary, machine learning approaches for high-dimensional pharmacogenetic prediction, and for prediction of clinical continuous traits of interest, hold great promise and warrant further research.
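The multi-step approach in the abstract (select SNPs, fit a model on the selected subset, score it on held-out data) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The stand-in model here is a simple correlation-weighted SNP score followed by a one-variable linear fit, not RFR, BRT or SVR. R² is taken as the coefficient of determination, which may differ from the definition used in the paper. All function names are hypothetical, and the data are synthetic. SNP selection is repeated inside each training fold so that held-out samples do not influence it.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def r_squared(actual, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_score_model(X, y, k):
    """Select the k SNPs most correlated with y, then fit y ~ a + b * score."""
    n_snps = len(X[0])
    corr = [pearson([row[j] for row in X], y) for j in range(n_snps)]
    top = sorted(range(n_snps), key=lambda j: abs(corr[j]), reverse=True)[:k]

    def score(row):
        # Correlation-weighted sum over the selected SNP genotypes.
        return sum(corr[j] * row[j] for j in top)

    s = [score(row) for row in X]
    ms, my = sum(s) / len(s), sum(y) / len(y)
    sxx = sum((v - ms) ** 2 for v in s)
    slope = sum((v - ms) * (t - my) for v, t in zip(s, y)) / sxx if sxx else 0.0
    intercept = my - slope * ms
    return lambda row: intercept + slope * score(row)


def cross_validated_r2(X, dose, k_snps, n_folds=5, seed=0):
    """Per-fold R^2 on sqrt(dose), with SNP selection done inside each fold."""
    y = [math.sqrt(d) for d in dose]
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit_score_model([X[i] for i in train], [y[i] for i in train], k_snps)
        scores.append(r_squared([y[i] for i in held_out],
                                [predict(X[i]) for i in held_out]))
    return scores


# Synthetic example: 120 subjects, 50 SNPs coded 0/1/2, two causal SNPs.
rng = random.Random(1)
X = [[rng.choice([0, 1, 2]) for _ in range(50)] for _ in range(120)]
dose = [(2.0 + 0.4 * r[0] - 0.3 * r[1] + rng.gauss(0, 0.2)) ** 2 for r in X]
fold_scores = cross_validated_r2(X, dose, k_snps=5)
print(sum(fold_scores) / len(fold_scores))
```

Varying `k_snps` and comparing the mean held-out R² mirrors, in miniature, the comparison of model sizes described above. In practice the stand-in model would be replaced by a random forest, boosted tree or support vector regressor from an established library.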

Related articles

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

Comparison of two different techniques of warfarin dosing determination - A chemometrics study

A high prevalence of genetic polymorphisms increases sensitivity to warfarin therapy. In this study, we investigated 47 patients with effective long-term therapy by warfarin well-controlled by monitoring of International Normalised Ratio (INR). All patients were tested for gene polymorphisms VKORC1, CYP2C9*C2, and CYP2C9*C3, which were used for a dose calculation employing a program www.Warfari...

Race influences warfarin dose changes associated with genetic factors.

Warfarin dosing algorithms adjust for race, assigning a fixed effect size to each predictor, thereby attenuating the differential effect by race. Attenuation likely occurs in both race groups but may be more pronounced in the less-represented race group. Therefore, we evaluated whether the effect of clinical (age, body surface area [BSA], chronic kidney disease [CKD], and amiodarone use) and ge...

Comparison of Nine Statistical Model Based Warfarin Pharmacogenetic Dosing Algorithms Using the Racially Diverse International Warfarin Pharmacogenetic Consortium Cohort Database

OBJECTIVE Multiple linear regression (MLR) and machine learning techniques in pharmacogenetic algorithm-based warfarin dosing have been reported. However, performances of these algorithms in racially diverse group have never been objectively evaluated and compared. In this literature-based study, we compared the performances of eight machine learning techniques with those of MLR in a large, rac...

Journal:
  • Bioinformatics

Volume 27, Issue 10

Pages: -

Publication date: 2011